We gain more information about change, at the cost of needing to take repeated measurements.
I think this is the same as a split plot in time… The model is
\[ Y_{ij} = X_{ij}'\beta + b_i + e_{ij} \] with \(b_i \sim N(0, \sigma_b^2)\) and \(e_{ij} \sim N(0, \sigma_e^2)\) independent. The covariance structure is compound symmetry, meaning
\[ \begin{aligned} \text{Cov}(Y_{i}) = \begin{bmatrix} \sigma_b^2 + \sigma_e^2 & \sigma_b^2 & \dots &\sigma_b^2 \\ \sigma_b^2 & \sigma_b^2 + \sigma_e^2 & \dots &\sigma_b^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_b^2 & \sigma_b^2 & \dots & \sigma_b^2 + \sigma_e^2 \\ \end{bmatrix} \end{aligned} \]
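As a sanity check, this structure can be constructed directly in base R (the variance values are illustrative, not estimates from any data):

```r
# compound symmetry: shared between-subject variance plus residual on the diagonal
sigma_b2 <- 2  # Var(b_i), illustrative value
sigma_e2 <- 1  # Var(e_ij), illustrative value
n <- 4         # measurements per subject
cs_cov <- matrix(sigma_b2, n, n) + diag(sigma_e2, n)
cs_cov
# every pair of observations on the same subject has the same correlation
cov2cor(cs_cov)[1, 2]  # sigma_b2 / (sigma_b2 + sigma_e2)
```

Every off-diagonal entry is \(\sigma_b^2\), so the within-subject correlation is constant regardless of the time separation.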
Special case of the so-called profile analysis. The main idea for MANOVA is to transform the repeated measurements into a set of derived response variables to analyze; the "multivariate" part comes from having multiple responses. One choice is the sum of all the responses, to examine the average over time. You could also create a variable for linear change across time, or quadratic change within subject.
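A sketch of building such derived variables with base R's orthogonal polynomial contrasts (the response vector y is made up for illustration):

```r
# orthogonal polynomial contrasts over 4 equally spaced occasions:
# columns are the linear, quadratic, and cubic trends
C <- contr.poly(4)
colnames(C)  # ".L" ".Q" ".C"

# derived responses for one subject's repeated measurements:
# the average plus trend scores (y is an illustrative vector)
y <- c(26, 25, 24, 22)
derived <- c(avg = mean(y), drop(y %*% C))
derived
```

Each column of `contr.poly(4)` is a unit-length contrast orthogonal to the others, so the derived variables partition the within-subject variation.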
There are some disadvantages, though.
Reduce the sequence of measurements on each individual to a small set of summary values. Then you can use the classic univariate tests, such as the t-test or ANOVA.
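A minimal sketch of this summary-measure strategy on simulated data (all names and values here are illustrative):

```r
set.seed(1)
# 20 subjects, 4 measurements each, two groups
d <- data.frame(id  = rep(1:20, each = 4),
                grp = rep(c("A", "B"), each = 40),
                y   = rnorm(80))
# collapse each subject's sequence to a single summary value (the mean)
subj_means <- aggregate(y ~ id + grp, data = d, FUN = mean)
# classic univariate two-sample test on the summaries
t.test(y ~ grp, data = subj_means)
```

Collapsing first also sidesteps the within-subject correlation, since each subject contributes one independent value.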
Drawbacks
The likelihood ratio test requires an additional fit under the null hypothesis, but has better properties; likelihood-ratio-based tests and confidence intervals are recommended. Note that REML is a correction for estimating both the mean and the covariance from the same data.
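The flavor of the REML correction shows up already in the simplest case: the ML variance estimate divides by n and is biased low because the mean was estimated from the same data, while the REML estimate divides by n - p (here p = 1), giving the familiar sample variance.

```r
set.seed(2)
x <- rnorm(10)
n <- length(x)
rss <- sum((x - mean(x))^2)
ml_var   <- rss / n        # ML estimate, biased downward
reml_var <- rss / (n - 1)  # REML estimate, the usual sample variance
c(ml = ml_var, reml = reml_var, var = var(x))
```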
The shorthand notation for the observations of a single individual is:
\[ \begin{aligned} Y_{i} = X_{i}\beta + \varepsilon_{i} \end{aligned} \] with \(\varepsilon_i \sim N(0, \Sigma)\), which expanded would look like,
\[ \begin{aligned} \begin{bmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{in_{i}} \\ \end{bmatrix} = \begin{bmatrix} X_{i11} & X_{i12} & \dots &X_{i1p} \\ X_{i21} & X_{i22} & \dots &X_{i2p} \\ \vdots & \vdots & \ddots & \vdots \\ X_{in_{i}1} & X_{in_{i}2} & \dots & X_{in_{i}p} \\ \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix} + \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{in_{i}} \end{bmatrix} \end{aligned} \]
In total, we would stack them, so we have
\[ \begin{aligned} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \\ \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \\ \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix} + \begin{bmatrix} \mathbf{\varepsilon_{1}} \\ \mathbf{\varepsilon_{2}} \\ \vdots \\ \mathbf{\varepsilon_{N}} \end{bmatrix} \end{aligned} \] Thus, \(\varepsilon\) has a block diagonal structure with each subject having the same covariance matrix.
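When every subject has the same number of observations and the same \(\Sigma\), this block-diagonal structure can be sketched with a Kronecker product (the 2×2 \(\Sigma\) here is illustrative):

```r
# same covariance for each of N subjects, 2 observations per subject
Sigma_i <- matrix(c(1,   0.5,
                    0.5, 1), 2, 2)
N <- 3
# block diagonal: Sigma_i repeated down the diagonal, zeros between subjects
Sigma <- kronecker(diag(N), Sigma_i)
Sigma
```

The zero off-diagonal blocks encode the assumption that observations from different subjects are independent.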
This section follows 5.5 in Fitzmaurice, Laird, and Ware (2011).
The main questions to ask in this model are
There are primarily two ways of studying the longitudinal responses
This is an exploratory plot summarizing the eventual statements we’d like to say about the model.
tlc_raw <- read.csv("data/tlc.csv", header=FALSE)
names(tlc_raw) <- c("id", "trt","w0", "w1", "w4","w6")
tlc <- tlc_raw %>% gather("week", "lead", 3:6)
# numeric version of week
tlc$week_int <- tlc$week %>% gsub(".*([0-9]+).*", "\\1", .) %>% as.numeric()
# timepoint (1:4)
tlc$time <- c(1:4)[as.factor(tlc$week)]
# change from baseline
tlc_diff <- tlc %>% group_by(id) %>%
arrange(trt, id, time) %>%
mutate(lead_diff = lead - lead[1],
lead_base = lead[1],
trt = factor(trt, levels = c("P", "A"))) %>%
arrange(desc(trt), id, time) %>%
filter(week != "w0")
tlc %>% ggplot(aes(x = week, y = lead, group = id)) +
geom_line(alpha = .2) +
facet_wrap(~trt)
# Mean Response Profile by Trt
tlc_mrp <- tlc %>%
group_by(trt, week) %>%
summarise(mlead = mean(lead),
.groups = "drop_last")
tlc_mrp %>%
ggplot(aes(x = week, y=mlead, group=trt)) +
geom_line(aes(linetype=trt)) +
geom_point() +
ggtitle("Lead over time") +
theme_classic()
* TLC data analysis example, from chapter 5;
* https://content.sph.harvard.edu/fitzmaur/ala2e/;
DATA tlc_long;
INFILE "~/lda/tlc-data.txt" DLM=" ";
input id group $ lead0 lead1 lead4 lead6;
* create 4 observations from each row;
y=lead0; time=0; output;
y=lead1; time=1; output;
y=lead4; time=4; output;
y=lead6; time=6; output;
drop lead0 lead1 lead4 lead6; * drop original "wide" data columns;
run;
/* set reference level */
/* http://support.sas.com/kb/37/108.html */
proc mixed noclprint=10 data=tlc_long;
class id group(ref="P") time(ref="0");
model y = group time group*time / s chisq;
repeated time / type=un subject=id r;
lsmeans group / cl diff;
run;
We fit an unstructured covariance, with fully crossed time and
treatment factors. To get the unstructured covariance matrix,
we use gls, for generalized least squares.
tlc$trt <- factor(tlc$trt, levels = c("P", "A")) # use P as the reference level
# Generalized Least Squares, defaults to REML
tlc_gls <- gls(lead ~ trt*week,
corr=corSymm(form = ~ time | id),
weights = varIdent(form = ~ 1 | week),
data=tlc)
# Table 5.5 in Fitzmaurice
summary(tlc_gls)
## Generalized least squares fit by REML
## Model: lead ~ trt * week
## Data: tlc
## AIC BIC logLik
## 2452.076 2523.559 -1208.038
##
## Correlation Structure: General
## Formula: ~time | id
## Parameter estimate(s):
## Correlation:
## 1 2 3
## 2 0.571
## 3 0.570 0.775
## 4 0.577 0.582 0.581
## Variance function:
## Structure: Different standard deviations per stratum
## Formula: ~1 | week
## Parameter estimates:
## w0 w1 w4 w6
## 1.000000 1.325880 1.370442 1.524813
##
## Coefficients:
## Value Std.Error t-value p-value
## (Intercept) 26.272 0.7102929 36.98756 0.0000
## trtA 0.268 1.0045059 0.26680 0.7898
## weekw1 -1.612 0.7919199 -2.03556 0.0425
## weekw4 -2.202 0.8149021 -2.70217 0.0072
## weekw6 -2.626 0.8885253 -2.95546 0.0033
## trtA:weekw1 -11.406 1.1199438 -10.18444 0.0000
## trtA:weekw4 -8.824 1.1524456 -7.65676 0.0000
## trtA:weekw6 -3.152 1.2565645 -2.50843 0.0125
##
## Correlation:
## (Intr) trtA weekw1 weekw4 weekw6 trtA:1 trtA:4
## trtA -0.707
## weekw1 -0.218 0.154
## weekw4 -0.191 0.135 0.680
## weekw6 -0.096 0.068 0.386 0.385
## trtA:weekw1 0.154 -0.218 -0.707 -0.481 -0.273
## trtA:weekw4 0.135 -0.191 -0.481 -0.707 -0.272 0.680
## trtA:weekw6 0.068 -0.096 -0.273 -0.272 -0.707 0.386 0.385
##
## Standardized residuals:
## Min Q1 Med Q3 Max
## -2.1756456 -0.6849980 -0.1515547 0.5294176 5.6327571
##
## Residual standard error: 5.02253
## Degrees of freedom: 400 total; 392 residual
The estimated (unstructured), marginal covariance structure can be
extracted by getVarCov.
getVarCov(tlc_gls) # covariance matrix
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 25.226 19.107 19.700 22.202
## [2,] 19.107 44.346 35.535 29.675
## [3,] 19.700 35.535 47.377 30.620
## [4,] 22.202 29.675 30.620 58.651
## Standard Deviations: 5.0225 6.6593 6.8831 7.6584
Getting the type III fixed effect tests, similar to SAS, is a little more work. There must be an easier way, but this shows how to do it manually. Here P0 denotes the mean of placebo at time 0.
| Conditional Mean (\(\mu\)) | Coef |
|---|---|
| P0 | \(\beta_1\) |
| P1 | \(\beta_1 + \beta_3\) |
| P4 | \(\beta_1 + \beta_4\) |
| P6 | \(\beta_1 + \beta_5\) |
| S0 | \(\beta_1 + \beta_2\) |
| S1 | \(\beta_1 + \beta_2 + \beta_3 + \beta_6\) |
| S4 | \(\beta_1 + \beta_2 + \beta_4 + \beta_7\) |
| S6 | \(\beta_1 + \beta_2 + \beta_5 + \beta_8\) |
Furthermore, the Wald test statistic is
\[ \begin{aligned} W^2 = (L\hat\beta)'\{L\mathrm{Cov}(\hat\beta)L'\}^{-1}L\hat\beta \end{aligned} \] Compare the following table to the SAS Type 3 tests for Fixed Effects
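For intuition, \(W^2\) can be computed by hand with base R matrix algebra; the coefficients and covariance below are small made-up numbers, not the fitted model's:

```r
beta_hat <- c(1.0, -0.5)                  # illustrative coefficients
cov_beta <- matrix(c(0.04, 0.01,
                     0.01, 0.09), 2, 2)   # illustrative Cov(beta_hat)
L <- matrix(c(0, 1), nrow = 1)            # H0: beta_2 = 0

Lb <- L %*% beta_hat
W2 <- drop(t(Lb) %*% solve(L %*% cov_beta %*% t(L)) %*% Lb)
W2  # for a single-row L this reduces to (L beta)^2 / Var(L beta)
```

Under the null, \(W^2\) is compared against a chi-squared distribution with rank(L) degrees of freedom.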
# variance of estimated coef, beta hat
covbeta <- tlc_gls$varBeta
beta <- coef(tlc_gls)
# testing interactions together, manually
trt_week_coef <- beta[6:8]
trt_week_cov <- covbeta[6:8, 6:8]
# t(trt_week_coef) %*% solve(trt_week_cov) %*% trt_week_coef
# avg treatment - avg control, H0: S = P
# sum of succimer cell-mean coefficients minus sum of placebo cell-mean coefficients
trt_L <- matrix(c(4, 4, 1, 1, 1, 1, 1, 1) - c(4, 0, 1, 1, 1, 0, 0, 0), nrow = 1) # c(0, 4, 0, 0, 0, 1, 1, 1)
# manually, it's a scalar:
# trt_L[c(2, 6:8)] %*% beta[c(2, 6:8)] *
#   solve(trt_L[, c(2, 6:8)] %*% covbeta[c(2, 6:8), c(2, 6:8)] %*% cbind(trt_L[, c(2, 6:8)])) *
#   trt_L[c(2, 6:8)] %*% beta[c(2, 6:8)]
# average by week, H0: w0 = w1 = w4 = w6
w0_L <- c(2, 1, 0, 0, 0, 0, 0, 0)
w1_L <- c(2, 1, 2, 0, 0, 1, 0, 0)
w4_L <- c(2, 1, 0, 2, 0, 0, 1, 0)
w6_L <- c(2, 1, 0, 0, 2, 0, 0, 1)
week_L <- matrix(c(w1_L - w0_L,
w4_L - w0_L,
w6_L - w0_L), nrow = 3, byrow=T)
# Combine the code above
rbind(anova(tlc_gls, L = trt_L), # Wald F value = 25.43
anova(tlc_gls, L = week_L), # Wald F value = 61.49
anova(tlc_gls, Terms = 4)) %>% # Wald F value = 35.9
as.data.frame() %>%
dplyr::mutate(Chisq = numDF * `F-value`,
.after = `F-value`) %>%
add_column(source = c("Trt", "Week", "Trt x Week"), .before = 1)
I’m not entirely sure what’s going on in the anova function in nlme. There are values for the intercept and the treatment, and I’m not sure how to interpret them.
The week and interaction terms seem to be calculated correctly; I’m not sure why the treatment value differs from the manual calculations above. In the presence of an interaction effect, however, it doesn’t really matter.
Also notice that the Chisq statistic is just the F statistic times the numerator df. The chisq test gives more liberal p-values, effectively assuming infinite residual degrees of freedom.
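We can check this relationship on the week row of the anova tables (F = 3.8731 on 3 and 392 df):

```r
Fval <- 3.8731; ndf <- 3; ddf <- 392
p_F     <- pf(Fval, ndf, ddf, lower.tail = FALSE)       # F-test p-value
p_chisq <- pchisq(ndf * Fval, ndf, lower.tail = FALSE)  # Chisq = F * numDF
round(c(F = p_F, chisq = p_chisq), 6)
# the chisq p-value is smaller (more liberal), matching the Anova() outputs
```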
The following are two anova tables of the same model,
# these results match
anova(tlc_gls, type = "marginal") # nlme
## Denom. DF: 392
## numDF F-value p-value
## (Intercept) 1 1368.0793 <.0001
## trt 1 0.0712 0.7898
## week 3 3.8731 0.0095
## trt:week 3 35.9293 <.0001
Anova(tlc_gls, "III", test.statistic = "F", error.df = tlc_gls$dims$N - tlc_gls$dims$p) # car w/ 392 error df
## Analysis of Deviance Table (Type III tests)
##
## Response: lead
## Df F Pr(>F)
## (Intercept) 1 1368.0793 < 2e-16 ***
## trt 1 0.0712 0.78977
## week 3 3.8731 0.00946 **
## trt:week 3 35.9293 < 2e-16 ***
## Residuals 392
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# more liberal
Anova(tlc_gls, "III", test.statistic = "Chisq")
## Analysis of Deviance Table (Type III tests)
##
## Response: lead
## Df Chisq Pr(>Chisq)
## (Intercept) 1 1368.0793 < 2.2e-16 ***
## trt 1 0.0712 0.789625
## week 3 11.6192 0.008808 **
## trt:week 3 107.7880 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# not sure what's going on here
anova(tlc_gls, type = "sequential")
For comparison, we can see that lm gives the same coefficient estimates as the gls model; for profile curves, the fit basically goes through the mean of each group. lm is NOT the right way to do this analysis, but we fit it to see what it shows for comparison.
tlc_lm <- lm(lead ~ trt*week, data = tlc)
cbind("gls" = coef(tlc_gls), "lm" = coef(tlc_lm))
## gls lm
## (Intercept) 26.272 26.272
## trtA 0.268 0.268
## weekw1 -1.612 -1.612
## weekw4 -2.202 -2.202
## weekw6 -2.626 -2.626
## trtA:weekw1 -11.406 -11.406
## trtA:weekw4 -8.824 -8.824
## trtA:weekw6 -3.152 -3.152
Compare the standard errors of the estimates, though:
cbind("gls" = sqrt(diag(as.matrix(tlc_gls$varBeta))),
"lm" = tlc_lm %>% tidy() %>% pull(std.error))
## gls lm
## (Intercept) 0.7102929 0.9370175
## trtA 1.0045059 1.3251428
## weekw1 0.7919199 1.3251428
## weekw4 0.8149021 1.3251428
## weekw6 0.8885253 1.3251428
## trtA:weekw1 1.1199438 1.8740349
## trtA:weekw4 1.1524456 1.8740349
## trtA:weekw6 1.2565645 1.8740349
I'm not quite sure this is the fairest comparison, but modeling the heterogeneity in the variance seems to have stabilized the standardized residuals. I need to double-check that this is the right stabilization.
plot(tlc_gls, resid(., type = "pearson") ~ week_int) # show against time (instead of default fitted)
plot(tlc$week_int, rstudent(tlc_lm))
A third strategy is to compute summary statistics of the curves. The section in the book refers to creating 1-df tests, which are more powerful than the overall F test of the trt × time interaction. Often, when many time points are measured, the overall F test becomes diluted and it is harder to detect differences between the two curves; these summary tests are designed to have more power. They should be specified prior to the analysis, though, to maintain proper control of significance.
We consider two tests,
\[ \begin{aligned} L &= (L_1, -L_1) \\ L_1 &= \left(-1, \frac{1}{n-1}, \frac{1}{n-1}, \dots, \frac{1}{n-1}\right) \end{aligned} \]
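With n = 4 occasions, this construction reproduces the "avg minus baseline" contrast vector used with emmeans below:

```r
n  <- 4
L1 <- c(-1, rep(1/(n - 1), n - 1))  # baseline vs average of the remaining occasions
L  <- c(L1, -L1)                    # group difference in that summary measure
L
```

Each half of L sums to zero, so it is a proper contrast on the cell means.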
Both of these can be formulated as contrasts; we'll use emmeans for convenience here.
emm_tlc_gls <- emmeans(tlc_gls, specs=c("week", "trt"))
## Analytical Satterthwaite method not available; using appx-satterthwaite
emm_tlc_gls
## week trt emmean SE df lower.CL upper.CL
## w0 P 26.3 0.710 97.3 24.9 27.7
## w1 P 24.7 0.942 98.4 22.8 26.5
## w4 P 24.1 0.973 98.5 22.1 26.0
## w6 P 23.6 1.083 98.3 21.5 25.8
## w0 A 26.5 0.710 97.3 25.1 27.9
## w1 A 13.5 0.942 98.4 11.7 15.4
## w4 A 15.5 0.973 98.5 13.6 17.4
## w6 A 20.8 1.083 98.3 18.6 22.9
##
## Degrees-of-freedom method: appx-satterthwaite
## Confidence level used: 0.95
avg_minus_baseline_L <- c(-1, 1/3, 1/3, 1/3, 1, -1/3, -1/3, -1/3)
auc_minus_baseline_L <- c(5.5, -2, -2.5, -1, -5.5, 2, 2.5, 1)
contrast(emm_tlc_gls, list("avg minus baseline" = avg_minus_baseline_L,
"AUC minus baseline" = auc_minus_baseline_L))
## contrast estimate SE df t.ratio p.value
## avg minus baseline 7.79 0.95 98 8.205 <.0001
## AUC minus baseline -48.02 5.35 98 -8.973 <.0001
##
## Degrees-of-freedom method: appx-satterthwaite
anova(tlc_gls, L = avg_minus_baseline_L)
## Denom. DF: 392
## F-test for linear combination(s)
## (Intercept) trtA weekw1 weekw4 weekw6 trtA:weekw1
## -1.0000000 0.3333333 0.3333333 0.3333333 1.0000000 -0.3333333
## trtA:weekw4 trtA:weekw6
## -0.3333333 -0.3333333
## numDF F-value p-value
## 1 1 91.2127 <.0001
The following example is from Faraway (2016).
data(ratdrink)
ratdrink %>% ggplot(aes(weeks, wt, group = subject)) +
geom_line() +
facet_wrap(~treat)
# full lm
rat_lm <- lm(wt ~ weeks*treat + subject, data = ratdrink) # fixed "block"
rat_mmer <- lmer(wt ~ weeks*treat + (1|subject), data = ratdrink) # random "block"
# plotting the predictions
rat_preds <- ratdrink %>% add_column(lm_yhat = predict(rat_lm),
mmer_yhat = predict(rat_mmer))
rat_preds %>% pivot_longer(c(wt, lm_yhat, mmer_yhat), names_to = "response", values_to = "y") %>%
ggplot() +
geom_line(aes(weeks, y, group = subject)) +
facet_grid(response~treat)
The bottom row is the raw data, the middle row is the predictions from the random-intercept model, and the top row is the predictions from the fully fixed model.
rat_lm_diffslope <- lm(wt~weeks*treat + subject + subject:weeks, data = ratdrink)
# rat_mmer_interaction <- lmer(wt ~ weeks*treat + (1|weeks:subject), data = ratdrink) # weeks is continuous! doesn't work.... too many random effect intercepts
rat_mmer_interaction <- lmer(wt ~ weeks*treat + (1|subject) + subject:weeks, data = ratdrink) # odd model: subject:weeks enters as a fixed effect here
rat_mmer_random_slope <- lmer(wt ~ weeks*treat + (weeks|subject), data = ratdrink)
rat_mmer_random_slope_nocor <- lmer(wt ~ weeks*treat + (1 | subject) + (0 + weeks || subject), data = ratdrink) # without correlation between intercept and slope
rat_preds_slope <- ratdrink %>% add_column(lm_diffslope_yhat = predict(rat_lm_diffslope),
mmer_interaction_yhat = predict(rat_mmer_interaction),
mmer_random_slope_yhat = predict(rat_mmer_random_slope),
mmer_random_slope_nocor_yhat = predict(rat_mmer_random_slope_nocor))
rat_preds_slope %>%
pivot_longer(c(wt,
lm_diffslope_yhat,
mmer_interaction_yhat,
mmer_random_slope_yhat,
mmer_random_slope_nocor_yhat), names_to = "response", values_to = "y") %>%
ggplot() +
geom_line(aes(weeks, y, group = subject)) +
facet_grid(response~treat)
# profile conf intervals
confint(rat_mmer_random_slope)
## Computing profile confidence intervals ...
## 2.5 % 97.5 %
## .sig01 3.4506344 7.6654748
## .sig02 -0.5261030 0.3794203
## .sig03 2.6064121 4.8687653
## .sigma 3.7555591 5.1142342
## (Intercept) 48.8692859 56.8907140
## weeks 24.0547424 28.9052555
## treatthiouracil -0.8920062 10.4520061
## treatthyroxine -7.0445321 5.4559606
## weeks:treatthiouracil -12.7998322 -5.9401707
## weeks:treatthyroxine -3.1166339 4.4423449
We use the MIT growth study (menarche) example for this section, from the Fitzmaurice website.
Covariates/Response:
fat <- read.dta("data/fat.dta")
# select 20 random girls and show response curve
fat %>%
group_nest(id) %>%
slice_sample(n=20) %>%
unnest(data) %>%
ggplot(aes(time, pbf)) +
geom_point() +
geom_line() +
facet_wrap(~id) +
geom_vline(xintercept = 0, color = "red")
Fitzmaurice, Laird, and Ware (2011), chapter 8.8, analyze this as a piecewise linear random effects model.
# to fit piecewise function, need variable with after menarche time
fat_post <- fat %>% mutate(timepost = time * (time > 0)) # create variable for post menarche time
fat_lme <- lme(pbf ~ time + timepost,
random = ~time + timepost | id,
data = fat_post)
The fixed effects of the model:
# Fixed effects
fat_lme %>% tidy() %>% filter(effect == "fixed") %>%
dplyr::select(term, estimate, std.error, statistic, df, p.value)
## # A tibble: 3 × 6
## term estimate std.error statistic df p.value
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 21.4 0.565 37.8 885 3.95e-187
## 2 time 0.417 0.157 2.65 885 8.09e- 3
## 3 timepost 2.05 0.228 8.98 885 1.60e- 18
The random effects and their standard errors are calculated from the inverse expected Fisher information (with the package lmeInfo):
# https://stats.oarc.ucla.edu/r/faq/how-can-i-calculate-standard-errors-for-variance-components-from-mixed-models/
# Random effects and standard errors
fat_varcomp <- getVarCov(fat_lme)[upper.tri(getVarCov(fat_lme), diag = TRUE)]
cbind(estimate = fat_varcomp,
`std. err` = sqrt(diag(varcomp_vcov(fat_lme)))[c(1, 2, 3, 4, 5, 6)])
## estimate std. err
## Tau.id.var((Intercept)) 45.939153 5.7556122
## Tau.id.cov(time,(Intercept)) 2.526047 1.1984610
## Tau.id.var(time) 1.630971 0.4023225
## Tau.id.cov(timepost,(Intercept)) -6.109078 1.8486864
## Tau.id.cov(timepost,time) -1.750363 0.5746242
## Tau.id.var(timepost) 2.749384 0.9273656
There is also strucchange::gefp for structural change testing.
We should note that there are econometric approaches, such as sandwich estimators, and mixed model approaches that are all competing in the space of modeling the variance. Broadly there are three approaches for variance modeling:
Packages with this approach from the sandwich vignette
# str(spruce88)
# xtabs(~day + tx, data=spruce88)
# xtabs(~tx,data = spruce88)
# Example of selecting out some groups of elements for longitudinal analysis. Used above for highlighting.
# Note: id is an integer, not a factor, so levels() returns NULL and the filter matches nothing:
spruce88 %>% filter(id %in% sample(levels(spruce88[,"id"])))
## [1] y day tx chamber id
## <0 rows> (or 0-length row.names)
sid <- spruce88 %>% group_by(chamber) %>% sample_n(5)
spruce_highlight <- spruce88 %>% filter(id %in% sid[["id"]])
# Emphasizing lines in the plot with background lighter
g_base <- ggplot(data = spruce88, mapping = aes(x=day, y=y, color = factor(tx), group=factor(id))) +
facet_wrap(~chamber)
(g_base +
geom_point(alpha=.2)+ geom_line(alpha=.2) + geom_line(data=spruce_highlight, aes(x=day, y=y, group=factor(id))))
# (g_base + geom_point(alpha=.2) + geom_smooth(span=.5, aes(group="abc")))
# Note: using a constant for the group, will override the previous grouping.
# Also, the smoothing will come up with "singularities" since there are only 5 distinct x data points, and lowess will have trouble.
A suggestion: if the data are taken at many different time points, round them to the nearest year; we can still get some sense of the correlation over time. If everything is already discrete, we can use the time categories as-is.
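A sketch of the rounding idea with made-up observation times:

```r
t_obs <- c(0.1, 0.9, 1.2, 2.8, 3.4)  # irregular observation times, illustrative
t_yr  <- round(t_obs)                # snap to the nearest year -> discrete time points
table(t_yr)                          # observations now share common time points
```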
We should remove the effect of the means from each week. The suggestion in the book is to remove the covariate effect by fitting a regression on the data.
# remove the effect of means from each week
# residuals(lm(y~day + tx + chamber, data=spruce88))
str(spruce88)
## 'data.frame': 395 obs. of 5 variables:
## $ y : num 4.51 4.98 5.41 5.9 6.14 ...
## $ day : num -49 -27 0 26 57 -49 -27 0 26 57 ...
## $ tx : num 1 1 1 1 1 1 1 1 1 1 ...
## $ chamber: num 1 1 1 1 1 1 1 1 1 1 ...
## $ id : int 1 1 1 1 1 2 2 2 2 2 ...
spruce_resid <- spruce88 %>% mutate(resid = residuals(lm(y~day + tx + chamber)))
For exploratory analysis, we should show the unstructured estimated covariance matrix from the model, shown for an individual \(i\),
\[ \begin{aligned} Y_i = X_i\beta + \varepsilon_i \end{aligned} \]
assuming that \(\varepsilon_i \sim N(0,\Sigma)\). We show the estimate \(\hat \Sigma\) as unstructured covariance matrix:
This is table 5.3 in Fitzmaurice, Laird, and Ware (2011).
# direct group means covariance
tlc_gls_un_cov <- gls(lead ~ trt*week,
correlation=corSymm(form = ~ time | id),
weights = varIdent(form = ~ 1 | week),
data=tlc)
getVarCov(tlc_gls_un_cov)
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 25.226 19.107 19.700 22.202
## [2,] 19.107 44.346 35.535 29.675
## [3,] 19.700 35.535 47.377 30.620
## [4,] 22.202 29.675 30.620 58.651
## Standard Deviations: 5.0225 6.6593 6.8831 7.6584
We can also look at the correlation with a scatter plot.
tlc["week"] <- as.factor(tlc$week)
# Split into the exposed and placebo groups and make plots from the placebo group.
# Drop week_int and time first; otherwise every row stays unique and spread() cannot widen properly.
tlc_exposed <- tlc %>% filter(trt == "A") %>% dplyr::select(id, trt, week, lead) %>% spread(week, lead)
tlc_placebo <- tlc %>% filter(trt == "P") %>% dplyr::select(id, trt, week, lead) %>% spread(week, lead)
pairs(tlc_placebo[3:6])
cor(tlc_placebo[3:6])
The variance covariance matrix can be extracted in two ways:
# using reStruct from model is scale version of covariance matrix
list(as.matrix(fat_lme$modelStruct$reStruct[[1]]) * fat_lme$sigma^2,
getVarCov(fat_lme))
## [[1]]
## (Intercept) time timepost
## (Intercept) 45.939153 2.526047 -6.109078
## time 2.526047 1.630971 -1.750363
## timepost -6.109078 -1.750363 2.749384
##
## [[2]]
## Random effects variance covariance matrix
## (Intercept) time timepost
## (Intercept) 45.9390 2.5260 -6.1091
## time 2.5260 1.6310 -1.7504
## timepost -6.1091 -1.7504 2.7494
## Standard Deviations: 6.7778 1.2771 1.6581
We can also model the within-subject correlation differently via the correlation argument. corCAR1 is a continuous-time AR(1) structure: with correlation parameter \(\phi\) (initialized at \(\phi = 0.2\) by default) and distance \(s\) between observations, the autocorrelation function is
\[ \begin{aligned} h(s, \phi) = \phi^s \quad s \geq 0, \phi \geq 0 \end{aligned} \]
We can manually create the correlation matrix using this function and
pairwise distances, or use the corStruct class in nlme.
list(manual = .2^dist(fat_post$time[fat_post$id == 1]),
lme = corMatrix(Initialize(corCAR1(form = ~time | id), data = fat_post))[[1]]) # phi = .2 default
## $manual
## 1 2 3 4 5
## 2 0.1968068917
## 3 0.0454964703 0.2311731563
## 4 0.0098617995 0.0501090149 0.2167596607
## 5 0.0018198587 0.0092469255 0.0399999969 0.1845361667
## 6 0.0003639718 0.0018493853 0.0080000000 0.0369072362 0.2000000156
##
## $lme
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1.0000000000 0.196806892 0.04549647 0.009861799 0.001819859 0.0003639718
## [2,] 0.1968068917 1.000000000 0.23117316 0.050109015 0.009246926 0.0018493853
## [3,] 0.0454964703 0.231173156 1.00000000 0.216759661 0.039999997 0.0080000000
## [4,] 0.0098617995 0.050109015 0.21675966 1.000000000 0.184536167 0.0369072362
## [5,] 0.0018198587 0.009246926 0.04000000 0.184536167 1.000000000 0.2000000156
## [6,] 0.0003639718 0.001849385 0.00800000 0.036907236 0.200000016 1.0000000000
fat_car1 <- lme(pbf~time + timepost,
random = ~ 1 | id,
corr=corCAR1(,form= ~ time | id),
data = fat_post)
fat_car1 %>% getVarCov(type = "marginal")
## id 1
## Marginal variance covariance matrix
## 1 2 3 4 5 6
## 1 48.672 39.818 35.798 33.626 32.441 31.897
## 2 39.818 48.672 40.439 35.991 33.563 32.448
## 3 35.798 40.439 48.672 40.185 35.554 33.427
## 4 33.626 35.991 40.185 48.672 39.581 35.408
## 5 32.441 33.563 35.554 39.581 48.672 39.878
## 6 31.897 32.448 33.427 35.408 39.878 48.672
## Standard Deviations: 6.9765 6.9765 6.9765 6.9765 6.9765 6.9765
fat_lme %>% getVarCov(type = "marginal")
## id 1
## Marginal variance covariance matrix
## 1 2 3 4 5 6
## 1 60.288 46.991 43.546 39.949 36.007 32.886
## 2 46.991 54.304 42.885 40.853 38.553 35.311
## 3 43.546 42.885 51.763 41.668 40.846 37.496
## 4 39.949 40.853 41.668 51.991 43.240 39.776
## 5 36.007 38.553 40.846 43.240 55.056 42.044
## 6 32.886 35.311 37.496 39.776 42.044 48.858
## Standard Deviations: 7.7645 7.3691 7.1946 7.2105 7.42 6.9898
getVarCov(fat_car1, type = "marginal")
## id 1
## Marginal variance covariance matrix
## 1 2 3 4 5 6
## 1 48.672 39.818 35.798 33.626 32.441 31.897
## 2 39.818 48.672 40.439 35.991 33.563 32.448
## 3 35.798 40.439 48.672 40.185 35.554 33.427
## 4 33.626 35.991 40.185 48.672 39.581 35.408
## 5 32.441 33.563 35.554 39.581 48.672 39.878
## 6 31.897 32.448 33.427 35.408 39.878 48.672
## Standard Deviations: 6.9765 6.9765 6.9765 6.9765 6.9765 6.9765
VarCorr(fat_car1)
## id = pdLogChol(1)
## Variance StdDev
## (Intercept) 31.37028 5.600917
## Residual 17.30157 4.159516
This section mostly uses gls with a variety of different
covariance matrices.
This dataset contains measurements of the distance from the pituitary gland to the pterygomaxillary fissure. There are 11 girls and 16 boys, measured at ages 8, 10, 12, and 14.
dental_wide <- read.table("data/dental.txt",
header = FALSE,
col.names = c("id", "gender", "y1", "y2", "y3", "y4"))
# long data format
dental <- dental_wide %>%
pivot_longer(cols = y1:y4,
names_to = "age",
values_to = "distance") %>%
dplyr::mutate(
id = factor(id),
age = recode(age,
y1 = 8,
y2 = 10,
y3 = 12,
y4 = 14),
agef = factor(age),
.after = "age")
dental_mean <- dental %>%
group_by(age, gender) %>%
dplyr::summarize(avg_distance = mean(distance),
.groups = "drop_last")
dental %>% ggplot(aes(age, distance, group = id)) +
geom_point(alpha = .3) +
geom_line(alpha = .3) +
geom_line(data = dental_mean, mapping = aes(age, avg_distance, group = NULL), color = "red") +
facet_wrap(~gender)
# all fixed with id
dental_fixed <- lm(distance ~ age*gender + id, data = dental)
dental_lm <- lm(distance ~ age*gender, data = dental)
# modeling covariance
dental_ident <- gls(distance~age*gender, data = dental) # same as lm
## heterogenous
dental_het <- gls(distance~age*gender,
data = dental,
weights = varIdent(form = ~ 1 | age))
## unstructured
dental_un <- gls(distance ~ age*gender,
data = dental,
correlation = corSymm(form = ~1 | id))
## heterogenous unstructured
dental_hun <- gls(distance ~ age*gender,
data = dental,
correlation = corSymm(form = ~1 | id),
weights = varIdent(form = ~1 | age))
## compound symmetry
dental_cs <- gls(distance ~ age*gender,
data = dental,
correlation = corCompSymm(form = ~1 | id))
## Heterogenous CS
dental_csh <- gls(distance ~ age*gender,
data = dental,
correlation = corCompSymm(form = ~1 | id),
weights = varIdent(form = ~1 | age))
## Autoregressive (homogeneous; the heterogeneous version with varIdent weights is dental_arh below)
dental_ar <- gls(distance ~ age*gender,
data = dental,
correlation = corAR1(form = ~1 | id))
## Autoregressive Heterogeneous
dental_arh <- gls(distance ~ age*gender,
data = dental,
correlation = corAR1(form = ~1 | id),
weights = varIdent(form = ~1 | age))
getVarCov(dental_un)
getVarCov(dental_hun)
getVarCov(dental_cs)
getVarCov(dental_csh)
getVarCov(dental_ar)
getVarCov(dental_arh)
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 5.2300 3.0094 3.3345 2.6923
## [2,] 3.0094 5.2300 3.0035 3.9166
## [3,] 3.3345 3.0035 5.2300 3.7687
## [4,] 2.6923 3.9166 3.7687 5.2300
## Standard Deviations: 2.2869 2.2869 2.2869 2.2869
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 5.4252 2.7092 3.8411 2.7152
## [2,] 2.7092 4.1906 2.9745 3.3137
## [3,] 3.8411 2.9745 6.2632 4.1333
## [4,] 2.7152 3.3137 4.1333 4.9862
## Standard Deviations: 2.3292 2.0471 2.5026 2.233
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 5.2207 3.2986 3.2986 3.2986
## [2,] 3.2986 5.2207 3.2986 3.2986
## [3,] 3.2986 3.2986 5.2207 3.2986
## [4,] 3.2986 3.2986 3.2986 5.2207
## Standard Deviations: 2.2849 2.2849 2.2849 2.2849
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 5.6967 3.1209 3.7419 3.3309
## [2,] 3.1209 4.2365 3.2269 2.8724
## [3,] 3.7419 3.2269 6.0901 3.4440
## [4,] 3.3309 2.8724 3.4440 4.8256
## Standard Deviations: 2.3868 2.0583 2.4678 2.1967
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 5.8149 3.2683 2.3721 1.3044
## [2,] 3.2683 4.5807 3.3246 1.8281
## [3,] 2.3721 3.3246 6.0172 3.3087
## [4,] 1.3044 1.8281 3.3087 4.5369
## Standard Deviations: 2.4114 2.1403 2.453 2.13
## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4]
## [1,] 5.8149 3.2683 2.3721 1.3044
## [2,] 3.2683 4.5807 3.3246 1.8281
## [3,] 2.3721 3.3246 6.0172 3.3087
## [4,] 1.3044 1.8281 3.3087 4.5369
## Standard Deviations: 2.4114 2.1403 2.453 2.13
# Models including random effects
dental_lme <- lme(distance ~ age*gender,
random = ~1 | id,
data = dental)
dental_lmer <- lmer(distance~age*gender + (1 | id), data = dental)
# same model in lmer and lme, G unstructured
dental_lmer_age <- lmer(distance ~ age*gender + (age | id), data = dental)
dental_lme_age <- lme(distance ~ age*gender,
random = ~ age | id,
data = dental)
# uncorrelated age and id in random effect, G diag
dental_lme_age_diag <- lme(distance ~ age*gender,
data = dental,
random = list(id = pdDiag(form = ~age))) # heterogenous, but uncorrelated age and id (age || id) in lmer
dental_lmer_age_diag <- lmer(distance~age*gender + (age || id), data = dental)
getVarCov(dental_lme)
getVarCov(dental_lme_age)
getVarCov(dental_lme_age_diag)
getVarCov(dental_lme, type = "marginal") # same as CS variance
getVarCov(dental_lme_age, type = "marginal")
getVarCov(dental_lme_age_diag, type = "marginal")
## Random effects variance covariance matrix
## (Intercept)
## (Intercept) 3.2986
## Standard Deviations: 1.8162
## Random effects variance covariance matrix
## (Intercept) age
## (Intercept) 5.78640 -0.289630
## age -0.28963 0.032524
## Standard Deviations: 2.4055 0.18035
## Random effects variance covariance matrix
## (Intercept) age
## (Intercept) 2.4168 0.0000000
## age 0.0000 0.0077469
## Standard Deviations: 1.5546 0.088017
## id 1
## Marginal variance covariance matrix
## 1 2 3 4
## 1 5.2207 3.2986 3.2986 3.2986
## 2 3.2986 5.2207 3.2986 3.2986
## 3 3.2986 3.2986 5.2207 3.2986
## 4 3.2986 3.2986 3.2986 5.2207
## Standard Deviations: 2.2849 2.2849 2.2849 2.2849
## id 1
## Marginal variance covariance matrix
## 1 2 3 4
## 1 4.9502 3.1751 3.1162 3.0574
## 2 3.1751 4.9625 3.3176 3.3888
## 3 3.1162 3.3176 5.2351 3.7202
## 4 3.0574 3.3888 3.7202 5.7679
## Standard Deviations: 2.2249 2.2277 2.288 2.4016
## id 1
## Marginal variance covariance matrix
## 1 2 3 4
## 1 4.7772 3.0366 3.1605 3.2845
## 2 3.0366 5.0561 3.3464 3.5014
## 3 3.1605 3.3464 5.3970 3.7183
## 4 3.2845 3.5014 3.7183 5.7998
## Standard Deviations: 2.1857 2.2486 2.3231 2.4083
The covariance components are harder to extract from lmer; I only know how to get them from the manual components. There may be something better that I don't know about.
# G
VarCorr(dental_lmer_age)[[1]]
## (Intercept) age
## (Intercept) 5.7744874 -0.2886962
## age -0.2886962 0.0324516
## attr(,"stddev")
## (Intercept) age
## 2.4030163 0.1801433
## attr(,"correlation")
## (Intercept) age
## (Intercept) 1.0000000 -0.6669087
## age -0.6669087 1.0000000
# Manual calculation of G = sigma^2 * Lambda %*% t(Lambda)
L <- getME(dental_lmer_age, "Lambda") # relative covariance factor of the random effects
sig <- getME(dental_lmer_age, "sigma")
(tcrossprod(L) * sig^2)[1:2, 1:2] # first subject's 2 x 2 block, matching VarCorr above
# ZGZ' + R
Z <- getME(dental_lmer_age, "Z")
G <- Matrix(diag(27)) %x% VarCorr(dental_lmer_age)[[1]] # Kronecker product: block diagonal over 27 subjects
R <- Matrix(diag(rep(sigma(dental_lmer_age)^2, 108))) #
varY <- Z %*% G %*% t(Z) + R # Var(Y)
varY[1:4, 1:4]
## 4 x 4 sparse Matrix of class "dgCMatrix"
## 1 2 3 4
## 1 4.948875 3.174083 3.115916 3.057749
## 2 3.174083 4.962347 3.317362 3.389001
## 3 3.115916 3.317362 5.235433 3.720254
## 4 3.057749 3.389001 3.720254 5.768131
# both criteria choose the cs matrix, the covariance is quite similar
AIC(dental_lm,
dental_het,
dental_un,
dental_hun,
dental_cs,
dental_csh,
dental_ar,
dental_arh,
dental_lme) %>%
add_column(BIC = BIC(
dental_lm,
dental_het,
dental_un,
dental_hun,
dental_cs,
dental_csh,
dental_ar,
dental_arh,
dental_lme)$BIC)
## Warning in AIC.default(dental_lm, dental_het, dental_un, dental_hun,
## dental_cs, : models are not all fitted to the same number of observations
## Warning in BIC.default(dental_lm, dental_het, dental_un, dental_hun,
## dental_cs, : models are not all fitted to the same number of observations
## df AIC BIC
## dental_lm 5 488.2418 501.6524
## dental_het 8 498.4114 519.5666
## dental_un 11 448.1706 477.2589
## dental_hun 14 452.5468 489.5683
## dental_cs 6 445.7572 461.6236
## dental_csh 9 449.9724 473.7719
## dental_ar 9 460.7962 484.5957
## dental_arh 9 460.7962 484.5957
## dental_lme 6 445.7572 461.6236
I was also curious how this compares to simply calculating the sample covariance. It is not all that different from the unstructured covariance estimate, though not identical; it seems the residuals being used differ. I had thought the ML estimator should be quite close to the unstructured sample estimate.
# sample covariance with group centering
dental %>% group_by(gender, age) %>%
mutate(cdist = distance - mean(distance, na.rm = TRUE)) %>% # group center
pivot_wider(id_cols = c(age), names_from = id, values_from = cdist) %>% # matrix format for cov
ungroup() %>% select(-age) %>%
data.matrix() %>%
t() %>% cov()
## [,1] [,2] [,3] [,4]
## [1,] 5.207168 2.612325 3.759834 2.605988
## [2,] 2.612325 4.023820 2.814576 3.189576
## [3,] 3.759834 2.814576 6.207441 3.971864
## [4,] 2.605988 3.189576 3.971864 4.793979
# predictions from each of the models
dental_yhat <- dental %>% add_column(un = predict(dental_un),
hun = predict(dental_hun),
cs = predict(dental_cs),
csh = predict(dental_csh),
ar = predict(dental_ar),
arh = predict(dental_arh),
het = predict(dental_het),
lme = predict(dental_lme),
fixed = predict(dental_fixed),
lm = predict(dental_lm),
lme_age_diag = predict(dental_lme_age_diag),
lme_age = predict(dental_lme_age))
dental_yhat %>% filter(id == 2) %>%
pivot_longer(cols = distance:lme_age, names_to = "type",
values_to = "response") %>%
ggplot(aes(age, response, color = type)) +
geom_point() +
geom_line()
Here we examine what all these models actually predict. Most of the marginal models produce nearly identical predictions, which makes sense. The “distance” points are the raw values for the one individual displayed. The fixed model (a separate fixed effect for each id) and the total mean model (lm, also called pooled) track along with the others. The estimates fall into roughly 3 separate groups, with some of the differences probably due to rounding error; theoretically, I think the gls beta estimates should all be the same.
Now we should also be concerned about how the vcov estimates differ across the models.
The sandwich estimators here are also worth comparing. They all have the sandwich form \((X'X)^{-1} X'\hat\Omega X (X'X)^{-1}\), differing in the “meat” \(X'\hat\Omega X\).
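As a sanity check on that form, the sandwich package exposes the pieces directly: `bread()` returns \((X'X/n)^{-1}\), `meat()` the middle term, and `sandwich()` their scaled product. A minimal sketch on a stand-in lm fit (the built-in `cars` data is just a placeholder):

```r
library(sandwich)
fit <- lm(dist ~ speed, data = cars) # stand-in model
n <- nobs(fit)
# sandwich = (1/n) * bread %*% meat %*% bread
manual <- bread(fit) %*% meat(fit) %*% bread(fit) / n
all.equal(manual, sandwich(fit)) # should be TRUE
```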
For heteroscedastic and autocorrelated errors, there are HAC estimators (and, as an alternative, feasible generalized least squares):
# autocorrelated
NeweyWest(dental_lm) # Bartlett kernel weights
## (Intercept) age genderM age:genderM
## (Intercept) 1.5805513 -0.11935718 -1.5695856 0.11933808
## age -0.1193572 0.01170302 0.1289546 -0.01245172
## genderM -1.5695856 0.12895463 3.9660625 -0.33731829
## age:genderM 0.1193381 -0.01245172 -0.3373183 0.03234847
NeweyWest(dental_lm, lag = 0) # lag = 0: no autocorrelation lags (prewhitening still on by default)
## (Intercept) age genderM age:genderM
## (Intercept) 2.3956703 -0.20743588 -2.4248178 0.20948506
## age -0.2074359 0.02111911 0.2205834 -0.02204344
## genderM -2.4248178 0.22058338 5.1489658 -0.46060717
## age:genderM 0.2094851 -0.02204344 -0.4606072 0.04503617
NeweyWest(dental_lm, lag = 1) # Bartlett kernel weights
## (Intercept) age genderM age:genderM
## (Intercept) 2.1562577 -0.17466452 -2.1221197 0.17190552
## age -0.1746645 0.01688017 0.1804904 -0.01725257
## genderM -2.1221197 0.18049043 4.8322769 -0.41689414
## age:genderM 0.1719055 -0.01725257 -0.4168941 0.03939289
vcovHAC(dental_lm)
## (Intercept) age genderM age:genderM
## (Intercept) 0.98793825 -0.076828444 -0.85989312 0.061591713
## age -0.07682844 0.008502217 0.07210153 -0.007497458
## genderM -0.85989312 0.072101528 2.07233695 -0.160873344
## age:genderM 0.06159171 -0.007497458 -0.16087334 0.015550636
vcovHAC(dental_lm, weights = weightsAndrews)
## (Intercept) age genderM age:genderM
## (Intercept) 0.98793825 -0.076828444 -0.85989312 0.061591713
## age -0.07682844 0.008502217 0.07210153 -0.007497458
## genderM -0.85989312 0.072101528 2.07233695 -0.160873344
## age:genderM 0.06159171 -0.007497458 -0.16087334 0.015550636
vcovHAC(dental_lm, weights = weightsLumley)
## (Intercept) age genderM age:genderM
## (Intercept) 2.3180590 -0.19769599 -2.6024990 0.21980484
## age -0.1976960 0.01957897 0.2314731 -0.02220841
## genderM -2.6024990 0.23147308 5.2298357 -0.44622075
## age:genderM 0.2198048 -0.02220841 -0.4462207 0.04203649
# vcovPL (panel Newey-West) seems most appropriate here
vcovPL(dental_lm, cluster = ~age , kernel = "Bartlett")
## (Intercept) age genderM age:genderM
## (Intercept) 0.36436517 -0.03429299 -0.37025009 0.03404624
## age -0.03429299 0.00491551 0.03440182 -0.00480140
## genderM -0.37025009 0.03440182 1.50047722 -0.12229004
## age:genderM 0.03404624 -0.00480140 -0.12229004 0.01217204
vcovPL(dental_lm, cluster = dental$agef) # same result, cluster given as a factor
## (Intercept) age genderM age:genderM
## (Intercept) 0.36436517 -0.03429299 -0.37025009 0.03404624
## age -0.03429299 0.00491551 0.03440182 -0.00480140
## genderM -0.37025009 0.03440182 1.50047722 -0.12229004
## age:genderM 0.03404624 -0.00480140 -0.12229004 0.01217204
vcovCL(dental_lm, cluster = ~id)
## (Intercept) age genderM age:genderM
## (Intercept) 0.56190635 -0.031179559 -0.56190635 0.031179559
## age -0.03117956 0.004258417 0.03117956 -0.004258417
## genderM -0.56190635 0.031179559 2.02816740 -0.145146882
## age:genderM 0.03117956 -0.004258417 -0.14514688 0.014592405
vcov(dental_cs)
## (Intercept) age genderM age:genderM
## (Intercept) 1.4006890 -0.096102802 -1.4006890 0.096102802
## age -0.0961028 0.008736618 0.0961028 -0.008736618
## genderM -1.4006890 0.096102802 2.3636627 -0.162173479
## age:genderM 0.0961028 -0.008736618 -0.1621735 0.014743044
vcov(dental_lme_age) # is cluster supposed to be age?
## (Intercept) age genderM age:genderM
## (Intercept) 1.5089562 -0.1121399 -1.5089562 0.11213994
## age -0.1121399 0.0107577 0.1121399 -0.01075770
## genderM -1.5089562 0.1121399 2.5463635 -0.18923615
## age:genderM 0.1121399 -0.0107577 -0.1892361 0.01815361
# standard errors are wildly different, why and how? Need to test reliability.
tidy(dental_csh)
## # A tibble: 4 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 17.4 1.18 14.7 3.08e-27
## 2 age 0.479 0.0929 5.15 1.22e- 6
## 3 genderM -1.27 1.53 -0.831 4.08e- 1
## 4 age:genderM 0.316 0.121 2.62 1.02e- 2
coeftest(dental_lm, vcov = vcovCL, cluster = ~id)
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.372727 0.749604 23.1759 < 2.2e-16 ***
## age 0.479545 0.065257 7.3486 4.712e-11 ***
## genderM -1.032102 1.424137 -0.7247 0.47025
## age:genderM 0.304830 0.120799 2.5234 0.01313 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coeftest(dental_lm, vcov = vcovPL, cluster = ~id) # almost 10x lower standard error?
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.372727 0.056518 307.3812 < 2.2e-16 ***
## age 0.479546 0.004701 102.0090 < 2.2e-16 ***
## genderM -1.032102 0.527300 -1.9573 0.05299 .
## age:genderM 0.304829 0.043791 6.9610 3.128e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coeftest(dental_lm, vcov = vcovHAC)
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.372727 0.993951 17.4785 < 2.2e-16 ***
## age 0.479545 0.092207 5.2007 9.998e-07 ***
## genderM -1.032102 1.439561 -0.7170 0.47501
## age:genderM 0.304830 0.124702 2.4445 0.01619 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Introducing this many different ways for estimating “robust” standard errors is quite confusing.
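To keep the options straight, it can help to line the standard errors up side by side. A sketch, assuming the dental data is the Orthodont dataset shipped with nlme (it matches in structure: 27 subjects, 108 observations), with `fit` standing in for dental_lm:

```r
library(nlme)     # for the Orthodont (dental) data
library(sandwich)
dat <- as.data.frame(Orthodont)
fit <- lm(distance ~ age * Sex, data = dat)        # stand-in for dental_lm
vcovs <- list(classical = vcov(fit),
              HC  = vcovHC(fit),                   # heteroskedasticity-consistent
              CL  = vcovCL(fit, cluster = ~Subject), # cluster-robust by subject
              HAC = vcovHAC(fit))                  # + autocorrelation-consistent
sapply(vcovs, function(v) sqrt(diag(v)))           # one column of SEs per estimator
```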
There are 4 strategies for baseline adjustment that we can discuss, from Fitzmaurice, Laird, and Ware (2011), chapter 5.7.
Depending on the strategy you choose, the interpretation of the coefficients will differ, so beware, although in some cases the coefficients of one strategy reduce to linear combinations of another's. In summary, Fitzmaurice compares the approaches and makes a recommendation, which we return to below.
We examine the differences of these 4 approaches on the TLC dataset.
This method is just the standard response profile analysis, which we’ve shown above.
tlc_baseline_1 <- tlc_gls
To force the baseline to be modeled the same across groups, we need to remove the main effect for treatment and the interaction term involving the baseline. Unfortunately we can't remove a single level of the interaction if we specify week*trt, nor does gls accept a manually built model matrix, so we must construct the indicator variables through the formula interface.
# keep in outcome, assume similar intercept
# remove main effect, and week0 interactions.
# gls doesn't take model.matrix, so manual specification through formula is the way to do it
# update also acts kinda weird
# table 5.7 in Fitzmaurice
# https://content.sph.harvard.edu/fitzmaur/ala2e/
tlc_baseline_2 <- gls(lead ~ week +
I(week == "w1" & trt == "A") +
I(week == "w4" & trt == "A") +
I(week == "w6" & trt == "A"),
data = tlc,
correlation = corSymm(form = ~ time | id),
weights = varIdent(form = ~ 1 | week))
# type III fixed effect test for interaction hypothesis
anova(tlc_baseline_2, Terms = 3:5) # Chisq = F * ndf = 111.94
## Denom. DF: 393
## F-test for: I(week == "w1" & trt == "A"), I(week == "w4" & trt == "A"), I(week == "w6" & trt == "A")
## numDF F-value p-value
## 1 3 37.31422 <.0001
This analysis is just a reframed version of the first analysis, at least in terms of estimates.
# modeling the difference
# time - 1, because gls throws error. corSymm needs consecutive integers starting at 1.
tlc_baseline_3 <- gls(lead_diff ~ trt*week,
correlation = corSymm(form = ~ time-1 | id), # unstructured
weights = varIdent(form = ~ 1 | week), # heterogenous
data = tlc_diff %>% mutate(trt = factor(trt, levels = c("P", "A"))))
anova(tlc_baseline_3, Terms = c(2, 4)) # trt x week interaction test is now test on combined c(trt, trt:week) of diff
## Denom. DF: 294
## F-test for: trt, trt:week
## numDF F-value p-value
## 1 3 35.92872 <.0001
| term | change |
|---|---|
| (Intercept) | -1.612 |
| trtA | -11.406 |
| weekw4 | -0.590 |
| weekw6 | -1.014 |
| trtA:weekw4 | 2.582 |
| trtA:weekw6 | 8.254 |
| term | raw |
|---|---|
| (Intercept) | 26.272 |
| trtA | 0.268 |
| weekw1 | -1.612 |
| weekw4 | -2.202 |
| weekw6 | -2.626 |
| trtA:weekw1 | -11.406 |
| trtA:weekw4 | -8.824 |
| trtA:weekw6 | -3.152 |
To see that they are the same, notice that the baseline change model term estimates are simply linear combinations of the raw model.
tribble(~`change model`, ~`raw model terms`, ~`value` , ~`interpretation`,
"(Intercept)", "weekw1", "-1.612", "in placebo, diff between w1 and baseline",
"trtA", "trtA:weekw1", "-11.406", "difference in trt slopes from baseline to w1",
"weekw4", "weekw4 - weekw1","-0.59", "in placebo, diff between w4 and w1",
"trtA:weekw4", "trtA:weekw4 - trtA:weekw1","2.582", "diff in trt slope from w4 to w1") %>% kbl() %>% kable_styling(full_width = FALSE)
| change model | raw model terms | value | interpretation |
|---|---|---|---|
| (Intercept) | weekw1 | -1.612 | in placebo, diff between w1 and baseline |
| trtA | trtA:weekw1 | -11.406 | difference in trt slopes from baseline to w1 |
| weekw4 | weekw4 - weekw1 | -0.59 | in placebo, diff between w4 and w1 |
| trtA:weekw4 | trtA:weekw4 - trtA:weekw1 | 2.582 | diff in trt slope from w4 to w1 |
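In general, the variance of any linear combination \(L\hat\beta\) is \(L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'\), so these correspondences can be checked mechanically. A minimal sketch on a stand-in lm fit (cars data as a placeholder):

```r
fit <- lm(dist ~ speed + I(speed^2), data = cars)  # stand-in model
L <- c(0, 1, -1)                            # contrast between the two slope terms
est <- drop(L %*% coef(fit))                # point estimate of the combination
se  <- drop(sqrt(t(L) %*% vcov(fit) %*% L)) # its standard error
c(estimate = est, std.error = se)
```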
Given that they are the same (or linear combinations), what about the variances of the estimates? Let's check for “weekw4”.
cbind(c(-1, 1) %*% tlc_gls$varBeta[3:4, 3:4] %*% c(-1, 1), # variance of the contrast in the raw model
tlc_baseline_3$varBeta["weekw4", "weekw4"]) # variance in the baseline change model
## [,1] [,2]
## [1,] 0.4130701 0.413059
They are the same!
list(raw = tlc_gls$varBeta,
base3=tlc_baseline_3$varBeta)
## $raw
## (Intercept) trtA weekw1 weekw4 weekw6
## (Intercept) 0.50451606 -0.50451606 -0.1223674 -0.1105218 -0.06048317
## trtA -0.50451606 1.00903212 0.1223674 0.1105218 0.06048317
## weekw1 -0.12236737 0.12236737 0.6271371 0.4390662 0.27183933
## weekw4 -0.11052175 0.11052175 0.4390662 0.6640654 0.27889731
## weekw6 -0.06048317 0.06048317 0.2718393 0.2788973 0.78947714
## trtA:weekw1 0.12236737 -0.24473475 -0.6271371 -0.4390662 -0.27183933
## trtA:weekw4 0.11052175 -0.22104350 -0.4390662 -0.6640654 -0.27889731
## trtA:weekw6 0.06048317 -0.12096634 -0.2718393 -0.2788973 -0.78947714
## trtA:weekw1 trtA:weekw4 trtA:weekw6
## (Intercept) 0.1223674 0.1105218 0.06048317
## trtA -0.2447347 -0.2210435 -0.12096634
## weekw1 -0.6271371 -0.4390662 -0.27183933
## weekw4 -0.4390662 -0.6640654 -0.27889731
## weekw6 -0.2718393 -0.2788973 -0.78947714
## trtA:weekw1 1.2542741 0.8781324 0.54367866
## trtA:weekw4 0.8781324 1.3281308 0.55779461
## trtA:weekw6 0.5436787 0.5577946 1.57895428
##
## $base3
## (Intercept) trtA weekw4 weekw6 trtA:weekw4
## (Intercept) 0.6271416 -0.6271416 -0.1880546 -0.3553027 0.1880546
## trtA -0.6271416 1.2542831 0.1880546 0.3553027 -0.3761093
## weekw4 -0.1880546 0.1880546 0.4130590 0.1951241 -0.4130590
## weekw6 -0.3553027 0.3553027 0.1951241 0.8729404 -0.1951241
## trtA:weekw4 0.1880546 -0.3761093 -0.4130590 -0.1951241 0.8261179
## trtA:weekw6 0.3553027 -0.7106053 -0.1951241 -0.8729404 0.3902482
## trtA:weekw6
## (Intercept) 0.3553027
## trtA -0.7106053
## weekw4 -0.1951241
## weekw6 -0.8729404
## trtA:weekw4 0.3902482
## trtA:weekw6 1.7458807
The marginal covariances are also the same if we change the estimation method to ML.
The idea here was that the covariance estimate from the change-from-baseline model can be recovered from the raw model's covariance estimates.
\[ \begin{aligned} \underbrace{\widehat{\mathrm{Cov}}(Y_{i2} -Y_{i1}, Y_{i2} -Y_{i1})}_{\Delta \text{ baseline model estimates}} = \underbrace{\widehat{\mathrm{Var}}(Y_{{i2}}) - 2\widehat{\mathrm{Cov}}(Y_{i2}, Y_{i1}) + \widehat{\mathrm{Var}}(Y_{i1})}_{\text{Raw model estimates}} \end{aligned} \]
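The identity is easy to verify numerically; since sample variances and covariances are bilinear, it holds exactly for them as well. A quick sketch on simulated pairs:

```r
set.seed(1)
library(MASS) # for mvrnorm
# draw correlated pairs (Y_1, Y_2)
y <- mvrnorm(1000, mu = c(0, 0), Sigma = matrix(c(2, 1, 1, 3), 2))
lhs <- var(y[, 2] - y[, 1]) # variance of the change score
rhs <- var(y[, 2]) - 2 * cov(y[, 1], y[, 2]) + var(y[, 1])
all.equal(lhs, rhs) # TRUE
```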
# We note this is NOT true when we use the REML estimates
# linear
L <- matrix(c(-1, 1, 0, 0,
-1, 0, 1, 0,
-1, 0, 0, 1), byrow = TRUE, ncol = 4)
# not guaranteed to match under REML
list(`from_baseline_change_model` = getVarCov(tlc_baseline_3),
`from_raw_model` = L %*% getVarCov(tlc_baseline_1) %*% t(L))
## $from_baseline_change_model
## Marginal variance covariance matrix
## [,1] [,2] [,3]
## [1,] 31.357 21.954 13.592
## [2,] 21.954 33.205 13.945
## [3,] 13.592 13.945 39.474
## Standard Deviations: 5.5997 5.7623 6.2828
##
## $from_raw_model
## [,1] [,2] [,3]
## [1,] 31.35685 21.95331 13.59197
## [2,] 21.95331 33.20327 13.94487
## [3,] 13.59197 13.94487 39.47386
With REML estimates, equality is not guaranteed (although here they agree to the printed precision). But if we refit the models with ML,
# what if we try ML?
tlc_baseline_3_ml <- tlc_baseline_3 %>% update(method = "ML")
tlc_baseline_1_ml <- tlc_gls %>% update(method = "ML")
list(`from_baseline_change_model` = getVarCov(tlc_baseline_3_ml),
`from_raw_model` = L %*% getVarCov(tlc_baseline_1_ml) %*% t(L))
## $from_baseline_change_model
## Marginal variance covariance matrix
## [,1] [,2] [,3]
## [1,] 30.729 21.515 13.319
## [2,] 21.515 32.541 13.666
## [3,] 13.319 13.666 38.684
## Standard Deviations: 5.5434 5.7044 6.2196
##
## $from_raw_model
## [,1] [,2] [,3]
## [1,] 30.72886 21.51453 13.31984
## [2,] 21.51453 32.54042 13.66637
## [3,] 13.31984 13.66637 38.68419
This method (ANCOVA) yields more efficient standard error estimates, but is generally recommended only for randomized studies; in observational studies there's a danger of introducing bias.
We can do ANCOVA on the raw values or on the change-from-baseline values. That is, we could consider either
\[ \begin{aligned} Y_{ij} &= X_{ij}'\beta + \gamma Y_{i1} + \varepsilon_{ij}, \\ Y_{ij} - Y_{i1} &= X_{ij}'\beta + \gamma^{*} Y_{i1} + \varepsilon_{ij}. \end{aligned} \]
Both of these give the same estimates for the other coefficients (maybe obviously when written out this way), with \(\gamma = \gamma^{*} + 1\) for the baseline term.
# ancova
tlc_baseline_4 <- gls(lead ~ trt*week + lead_base,
correlation = corSymm(form = ~ time-1 | id), # unstructured
weights = varIdent(form = ~ 1 | week), # heterogeneous
data = tlc_diff)
tlc_baseline_4b <- gls(lead_diff ~ trt*week + lead_base,
correlation = corSymm(form = ~ time-1 | id), # unstructured
weights = varIdent(form = ~ 1 | week), # heterogeneous
data = tlc_diff)
cbind(coef(tlc_baseline_4),
coef(tlc_baseline_4b)) %>%
`colnames<-`(c("Ancova", "Ancova Diff")) %>%
as.data.frame() %>%
rownames_to_column("term")
## term Ancova Ancova Diff
## 1 (Intercept) 3.5236964 3.5236964
## 2 trtA -11.3536109 -11.3536109
## 3 weekw4 -0.5900000 -0.5900000
## 4 weekw6 -1.0140000 -1.0140000
## 5 lead_base 0.8045183 -0.1954817
## 6 trtA:weekw4 2.5820000 2.5820000
## 7 trtA:weekw6 8.2540000 8.2540000
It’s hard to compare the estimates of the coefficients directly, so we compare them for the hypothesis test of the treatment by time interaction.
# contrast matrix for tlc_baseline_3
# trtA (interaction w1), trtA + trtA:w4, trtA + trtA:w6
L <- matrix(c(0, 1, 0, 0, 0, 0,
0, 1, 0, 0, 1, 0,
0, 1, 0, 0, 0, 1), byrow = T, nrow = 3)
rbind(
`raw` = anova(tlc_baseline_1, Terms = 4),
`constrain baseline` = anova(tlc_baseline_2, Terms = 3:5),
`baseline change` = anova(tlc_baseline_3, L = L), # same as tlc_baseline_1 as expected
`ancova` = anova(tlc_baseline_4, Terms = c(2,5))) %>% # similarly efficient as baseline 2 model
as.data.frame() %>%
mutate(`Chisq` = numDF * `F-value`,
.after= `F-value`)
## numDF F-value Chisq p-value
## raw 3 35.92934 107.7880 1.557061e-20
## constrain baseline 3 37.31422 111.9427 3.074120e-21
## baseline change 3 35.92872 107.7862 8.227085e-20
## ancova 3 37.04297 111.1289 2.517144e-20
We see that 2 and 4 have larger F-statistics, implying that there is slightly more power to detect the group differences (assuming the assumptions are reasonable).
Fitzmaurice, Laird, and Ware (2011) concludes that approach 2 should be recommended: we get similar efficiency gains, without the implicit assumption the ANCOVA model makes about the covariance matrix. In the ANCOVA model we only estimate a 3x3 covariance matrix, whereas in the baseline change model we estimated a 4x4 covariance matrix. Hence, ANCOVA implicitly assumes \(\mathrm{Cov}(Y_{i1}, Y_{i2}) = \mathrm{Cov}(Y_{i1}, Y_{i3}) = \mathrm{Cov}(Y_{i1}, Y_{i4})\).
tlc_baseline_4 %>% getVarCov()
## Marginal variance covariance matrix
## [,1] [,2] [,3]
## [1,] 30.151 20.864 12.991
## [2,] 20.864 32.230 13.460
## [3,] 12.991 13.460 39.478
## Standard Deviations: 5.491 5.6771 6.2831
Residuals from a longitudinal model will be correlated. Fitzmaurice, Laird, and Ware (2011) recommends that we “decorrelate” them with a Cholesky decomposition of the estimated covariance of the errors. The estimators and terminology can get confusing here…
\[ \begin{aligned} \varepsilon &= Y_i - X_i\beta \\ \hat\varepsilon &= Y_i - X_i \hat\beta \\ \mathrm{Cov}(\varepsilon) &= \Sigma \\ \widehat{\mathrm{Cov}}(\varepsilon) &= \hat\Sigma \\ \mathrm{Cov}(\hat\varepsilon) &= ?? \end{aligned} \]
We assume that \(\mathrm{Cov}(\hat \varepsilon) \approx \mathrm{Cov}(\varepsilon)\).
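A minimal sketch of the decorrelation step itself, on simulated errors; the only assumption is that `chol()` returns the upper triangular \(U\) with \(\Sigma = U'U\), so multiplying the residual rows by \(U^{-1}\) gives covariance near the identity:

```r
set.seed(42)
library(MASS) # for mvrnorm
Sigma <- matrix(c(4, 2, 2, 3), 2) # a "true" error covariance
U <- chol(Sigma)                  # Sigma = U'U, U upper triangular
e <- mvrnorm(1e4, c(0, 0), Sigma) # correlated errors, one row per subject
z <- e %*% solve(U)               # decorrelated ("normalized") errors
round(cov(z), 2)                  # approximately the identity matrix
```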
# can extract residuals with type = "normalized"
fat_post %>% count(id)
## id n
## 1 1 6
## 2 2 9
## 3 3 8
## 4 4 6
## 5 5 7
## 6 6 8
## 7 7 9
## 8 8 3
## 9 9 7
## 10 10 7
## 11 11 7
## 12 12 6
## 13 13 5
## 14 14 8
## 15 15 9
## 16 16 10
## 17 17 8
## 18 18 9
## 19 19 9
## 20 20 8
## 21 21 9
## 22 22 6
## 23 23 5
## 24 24 8
## 25 25 7
## 26 26 7
## 27 27 5
## 28 28 7
## 29 29 6
## 30 30 8
## 31 31 6
## 32 32 5
## 33 33 5
## 34 34 5
## 35 35 6
## 36 36 7
## 37 37 6
## 38 38 8
## 39 39 6
## 40 40 6
## 41 41 5
## 42 42 9
## 43 43 5
## 44 44 3
## 45 45 7
## 46 46 7
## 47 47 6
## 48 48 5
## 49 49 7
## 50 50 8
## 51 51 8
## 52 52 7
## 53 53 8
## 54 54 7
## 55 55 9
## 56 56 8
## 57 57 6
## 58 58 5
## 59 59 5
## 60 60 5
## 61 61 8
## 62 62 8
## 63 63 7
## 64 64 6
## 65 65 7
## 66 66 8
## 67 67 8
## 68 68 7
## 69 69 7
## 70 70 3
## 71 71 8
## 72 72 6
## 73 73 8
## 74 74 8
## 75 75 7
## 76 76 6
## 77 77 3
## 78 78 8
## 79 79 7
## 80 80 8
## 81 81 6
## 82 82 6
## 83 83 6
## 84 84 6
## 85 85 6
## 86 86 8
## 87 87 8
## 88 88 5
## 89 89 3
## 90 90 3
## 91 91 5
## 92 92 8
## 93 93 7
## 94 94 6
## 95 95 7
## 96 96 6
## 97 97 5
## 98 98 8
## 99 99 8
## 100 100 8
## 101 101 8
## 102 102 8
## 103 103 6
## 104 104 5
## 105 105 5
## 106 106 5
## 107 107 6
## 108 108 4
## 109 109 7
## 110 110 7
## 111 111 8
## 112 112 7
## 113 113 7
## 114 114 7
## 115 115 7
## 116 116 7
## 117 117 6
## 118 118 5
## 119 119 5
## 120 120 7
## 121 121 7
## 122 122 7
## 123 123 7
## 124 124 5
## 125 125 6
## 126 126 4
## 127 127 5
## 128 128 7
## 129 129 7
## 130 130 6
## 131 131 6
## 132 132 7
## 133 133 7
## 134 134 4
## 135 135 7
## 136 136 6
## 137 137 7
## 138 138 7
## 139 139 7
## 140 140 7
## 141 141 7
## 142 142 7
## 143 143 5
## 144 144 6
## 145 145 7
## 146 146 7
## 147 147 5
## 148 148 5
## 149 149 5
## 150 150 6
## 151 151 6
## 152 152 6
## 153 153 6
## 154 154 6
## 155 155 6
## 156 156 6
## 157 157 6
## 158 158 6
## 159 159 5
## 160 160 3
## 161 161 5
## 162 162 5
S <- getVarCov(fat_lme, type = "marginal", individuals = 1)
S
## id 1
## Marginal variance covariance matrix
## 1 2 3 4 5 6
## 1 60.288 46.991 43.546 39.949 36.007 32.886
## 2 46.991 54.304 42.885 40.853 38.553 35.311
## 3 43.546 42.885 51.763 41.668 40.846 37.496
## 4 39.949 40.853 41.668 51.991 43.240 39.776
## 5 36.007 38.553 40.846 43.240 55.056 42.044
## 6 32.886 35.311 37.496 39.776 42.044 48.858
## Standard Deviations: 7.7645 7.3691 7.1946 7.2105 7.42 6.9898
R <- S[[1]] %>% chol()
R
## 1 2 3 4 5 6
## 1 7.764524 6.051983 5.608287 5.145088 4.637412 4.235458
## 2 0.000000 4.204493 2.127138 2.310672 2.494246 2.301845
## 3 0.000000 0.000000 3.973052 1.987756 2.399240 2.226406
## 4 0.000000 0.000000 0.000000 4.028538 2.196216 2.045436
## 5 0.000000 0.000000 0.000000 0.000000 4.092661 1.668148
## 6 0.000000 0.000000 0.000000 0.000000 0.000000 3.700942
crossprod(R) # R'R
## 1 2 3 4 5 6
## 1 60.28784 46.99077 43.54568 39.94916 36.00730 32.88632
## 2 46.99077 54.30426 42.88479 40.85319 38.55258 35.31101
## 3 43.54568 42.88479 51.76274 41.66771 40.84585 37.49563
## 4 39.94916 40.85319 41.66771 51.99143 43.23992 39.77628
## 5 36.00730 38.55258 40.84585 43.23992 55.05645 42.04400
## 6 32.88632 35.31101 37.49563 39.77628 42.04400 48.85798
eps_hat <- residuals(fat_lme, type = "pearson")
eps_hat[fat_post$id == 1]
## 1 1 1 1 1 1
## -1.4696365 0.6821217 -0.3313941 2.4943915 -2.0320743 0.2480700
Rinv <- solve(R)
eps_hat[fat_post$id == 1] %*% Rinv
## 1 2 3 4 5 6
## [1,] -0.1892758 0.4346816 -0.04895698 0.6357493 -0.8594188 0.07874242
residuals(fat_lme, type = "normalized")[fat_post$id == 1]
## 1 1 1 1 1 1
## -1.4696365 0.6821217 -0.3313941 2.4943915 -2.0320743 0.2480700
fat_lme$modelStruct
## reStruct parameters:
## id1 id2 id3 id4 id5 id6
## 0.7894148 -0.9241612 -1.3828910 0.1210868 -0.2928406 -0.3762166
?nlme:::recalc
nlme:::residuals.lme
## function (object, level = Q, type = c("response", "pearson",
## "normalized"), asList = FALSE, ...)
## {
## type <- match.arg(type)
## Q <- object$dims$Q
## val <- object[["residuals"]]
## if (is.character(level)) {
## nlevel <- match(level, names(val))
## if (any(aux <- is.na(nlevel))) {
## stop(sprintf(ngettext(sum(aux), "nonexistent level %s",
## "nonexistent levels %s"), level[aux]), domain = NA)
## }
## level <- nlevel
## }
## else {
## level <- 1 + level
## }
## if (type != "response") {
## val <- val[, level]/attr(val, "std")
## }
## else {
## val <- val[, level]
## }
## if (type == "normalized") {
## if (!is.null(cSt <- object$modelStruct$corStruct)) {
## val <- recalc(cSt, list(Xy = as.matrix(val)))$Xy[,
## seq_along(level)]
## }
## else {
## type <- "pearson"
## }
## }
## if (length(level) == 1) {
## grps <- as.character(object[["groups"]][, max(c(1, level -
## 1))])
## if (asList) {
## val <- as.list(split(val, ordered(grps, levels = unique(grps))))
## }
## else {
## grp.nm <- row.names(object[["groups"]])
## val <- naresid(object$na.action, val)
## names(val) <- grps[match(names(val), grp.nm)]
## }
## attr(val, "label") <- switch(type, response = {
## if (!is.null(aux <- attr(object, "units")$y)) paste("Residuals",
## aux) else "Residuals"
## }, pearson = "Standardized residuals", normalized = "Normalized residuals")
## val
## }
## else naresid(object$na.action, val)
## }
## <bytecode: 0x7fa806683640>
## <environment: namespace:nlme>
The next method deals with aggregation.
The econometrics methods for longitudinal (“panel”) data are a little different from those in statistics. We'll explore the terminology and methods here.
The plm package provides functions for panel data from the econometrician's point of view. It provides a few functions:
data("EmplUK", package = "plm")
# pdata.frame creates a dataframe for plm to work with
E <- pdata.frame(EmplUK, # The orig dataframe
index=c("firm","year"), # The "individual" and "time" variable names
drop.index = TRUE, # Don't show index columns
row.names = TRUE) # Show rownames that combine individual-time
class(E) # pdata.frame
## [1] "pdata.frame" "data.frame"
summary(E$emp) # Selecting the column from a pdata.frame give a "pseries" object
## total sum of squares: 261539.4
## id time
## 0.980765381 0.009108488
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.104 1.181 2.287 7.892 7.020 108.562
methods(class="pseries") # Looking at the methods with a pseries object
## [1] as.matrix between Between Complex
## [5] D diff F G
## [9] index is.pbalanced is.pconsecutive L
## [13] lag lead make.pbalanced make.pconsecutive
## [17] Math Ops pcdtest pdim
## [21] plot print pvar Sum
## [25] summary Within
## see '?methods' for accessing help and source code
# plm:::print.summary.pseries
# plm:::summary.pseries # The function for summarizing the series
The pseries data type comes with various functions that correctly compute operations such as lags within each individual:
between(E$emp) # means across the time periods
Sum(E$emp) # sum across the time periods
Within(E$emp) # deviation from mean
head(plm:::lag.pseries(E$emp, 0:2)) # Lag the sequence within subject by however many
data(Grunfeld)
Grunfeld$firm <- factor(Grunfeld$firm)
Grunfeld$year <- factor(Grunfeld$year)
skim(Grunfeld)
| Name | Grunfeld |
| Number of rows | 200 |
| Number of columns | 5 |
| _______________________ | |
| Column type frequency: | |
| factor | 2 |
| numeric | 3 |
| ________________________ | |
| Group variables | None |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| firm | 0 | 1 | FALSE | 10 | 1: 20, 2: 20, 3: 20, 4: 20 |
| year | 0 | 1 | FALSE | 20 | 193: 10, 193: 10, 193: 10, 193: 10 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| inv | 0 | 1 | 145.96 | 216.88 | 0.93 | 33.56 | 57.48 | 138.04 | 1486.7 | ▇▁▁▁▁ |
| value | 0 | 1 | 1081.68 | 1314.47 | 58.12 | 199.98 | 517.95 | 1679.85 | 6241.7 | ▇▂▁▁▁ |
| capital | 0 | 1 | 276.02 | 301.10 | 0.80 | 79.17 | 205.60 | 358.10 | 2226.3 | ▇▁▁▁▁ |
Grunfeld %>% ggplot(aes(year, inv, color = factor(firm))) +
geom_line()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?
grun.po <- plm(inv~value + capital, data = Grunfeld, model = "pooling") # pooled
grun.fe <- plm(inv~value + capital, data = Grunfeld, model = "within") # fixed effects
grun.re <- plm(inv~value + capital, data = Grunfeld, model = "random") # random effects
grun.fd <- plm(inv~value + capital, data = Grunfeld, model = "fd") # first difference
summary(grun.po)
## Pooling Model
##
## Call:
## plm(formula = inv ~ value + capital, data = Grunfeld, model = "pooling")
##
## Balanced Panel: n = 10, T = 20, N = 200
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -291.6757 -30.0137 5.3033 34.8293 369.4464
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) -42.7143694 9.5116760 -4.4907 1.207e-05 ***
## value 0.1155622 0.0058357 19.8026 < 2.2e-16 ***
## capital 0.2306785 0.0254758 9.0548 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 9359900
## Residual Sum of Squares: 1755900
## R-Squared: 0.81241
## Adj. R-Squared: 0.8105
## F-statistic: 426.576 on 2 and 197 DF, p-value: < 2.22e-16
summary(grun.fe)
## Oneway (individual) effect Within Model
##
## Call:
## plm(formula = inv ~ value + capital, data = Grunfeld, model = "within")
##
## Balanced Panel: n = 10, T = 20, N = 200
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -184.00857 -17.64316 0.56337 19.19222 250.70974
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## value 0.110124 0.011857 9.2879 < 2.2e-16 ***
## capital 0.310065 0.017355 17.8666 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 2244400
## Residual Sum of Squares: 523480
## R-Squared: 0.76676
## Adj. R-Squared: 0.75311
## F-statistic: 309.014 on 2 and 188 DF, p-value: < 2.22e-16
summary(grun.re)
## Oneway (individual) effect Random Effect Model
## (Swamy-Arora's transformation)
##
## Call:
## plm(formula = inv ~ value + capital, data = Grunfeld, model = "random")
##
## Balanced Panel: n = 10, T = 20, N = 200
##
## Effects:
## var std.dev share
## idiosyncratic 2784.46 52.77 0.282
## individual 7089.80 84.20 0.718
## theta: 0.8612
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -177.6063 -19.7350 4.6851 19.5105 252.8743
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## (Intercept) -57.834415 28.898935 -2.0013 0.04536 *
## value 0.109781 0.010493 10.4627 < 2e-16 ***
## capital 0.308113 0.017180 17.9339 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 2381400
## Residual Sum of Squares: 548900
## R-Squared: 0.7695
## Adj. R-Squared: 0.76716
## Chisq: 657.674 on 2 DF, p-value: < 2.22e-16
summary(grun.fd)
## Oneway (individual) effect First-Difference Model
##
## Call:
## plm(formula = inv ~ value + capital, data = Grunfeld, model = "fd")
##
## Balanced Panel: n = 10, T = 20, N = 200
## Observations used in estimation: 190
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -200.889558 -13.889063 0.016677 9.504223 195.634938
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) -1.8188902 3.5655931 -0.5101 0.6106
## value 0.0897625 0.0083636 10.7325 < 2.2e-16 ***
## capital 0.2917667 0.0537516 5.4281 1.752e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 584410
## Residual Sum of Squares: 345460
## R-Squared: 0.40888
## Adj. R-Squared: 0.40256
## F-statistic: 64.6736 on 2 and 187 DF, p-value: < 2.22e-16
# Statistical equivalent
summary(lm(inv~value + capital, data = Grunfeld)) # pooled
##
## Call:
## lm(formula = inv ~ value + capital, data = Grunfeld)
##
## Residuals:
## Min 1Q Median 3Q Max
## -291.68 -30.01 5.30 34.83 369.45
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -42.714369 9.511676 -4.491 1.21e-05 ***
## value 0.115562 0.005836 19.803 < 2e-16 ***
## capital 0.230678 0.025476 9.055 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 94.41 on 197 degrees of freedom
## Multiple R-squared: 0.8124, Adjusted R-squared: 0.8105
## F-statistic: 426.6 on 2 and 197 DF, p-value: < 2.2e-16
summary(lm(inv~firm + capital + value, data = Grunfeld)) # fixed effects
##
## Call:
## lm(formula = inv ~ firm + capital + value, data = Grunfeld)
##
## Residuals:
## Min 1Q Median 3Q Max
## -184.009 -17.643 0.563 19.192 250.710
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -70.29672 49.70796 -1.414 0.159
## firm2 172.20253 31.16126 5.526 1.08e-07 ***
## firm3 -165.27512 31.77556 -5.201 5.14e-07 ***
## firm4 42.48742 43.90988 0.968 0.334
## firm5 -44.32010 50.49226 -0.878 0.381
## firm6 47.13542 46.81068 1.007 0.315
## firm7 3.74324 50.56493 0.074 0.941
## firm8 12.75106 44.05263 0.289 0.773
## firm9 -16.92555 48.45327 -0.349 0.727
## firm10 63.72887 50.33023 1.266 0.207
## capital 0.31007 0.01735 17.867 < 2e-16 ***
## value 0.11012 0.01186 9.288 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 52.77 on 188 degrees of freedom
## Multiple R-squared: 0.9441, Adjusted R-squared: 0.9408
## F-statistic: 288.5 on 11 and 188 DF, p-value: < 2.2e-16
summary(lmer(inv~ (1|firm) + capital + value, data = Grunfeld)) # random intercept model; note lmer and plm use different estimators for the random effects
## Warning: Some predictor variables are on very different scales: consider
## rescaling
## Linear mixed model fit by REML ['lmerMod']
## Formula: inv ~ (1 | firm) + capital + value
## Data: Grunfeld
##
## REML criterion at convergence: 2195.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.4319 -0.3498 0.0210 0.3592 4.8145
##
## Random effects:
## Groups Name Variance Std.Dev.
## firm (Intercept) 7367 85.83
## Residual 2781 52.74
## Number of obs: 200, groups: firm, 10
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -57.86442 29.37776 -1.97
## capital 0.30819 0.01717 17.95
## value 0.10979 0.01053 10.43
##
## Correlation of Fixed Effects:
## (Intr) capitl
## capital -0.019
## value -0.328 -0.368
## fit warnings:
## Some predictor variables are on very different scales: consider rescaling
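The fits above can be collected side by side to make the comparison concrete. A small sketch (assuming the plm and lme4 packages and the Grunfeld data loaded above):

```r
library(lme4)
data("Grunfeld", package = "plm")

pooled <- lm(inv ~ value + capital, data = Grunfeld)
fixed  <- lm(inv ~ firm + capital + value, data = Grunfeld)
random <- lmer(inv ~ (1 | firm) + capital + value, data = Grunfeld)

# The slope estimates for value and capital are nearly identical under the
# fixed- and random-effects fits, but differ from the pooled fit
cbind(pooled = coef(pooled)[c("value", "capital")],
      fixed  = coef(fixed)[c("value", "capital")],
      random = fixef(random)[c("value", "capital")])
```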
See Wikipedia; the article is fairly comprehensive.
Common forms of the gls command are as follows (from Pinheiro and Bates (2000)):

gls( model, data, correlation ) - correlated errors
gls( model, data, weights ) - heteroscedastic errors
gls( model, data, correlation, weights ) - both

We can play around with the structure of covariance matrices with:
# To specify an unstructured correlation, corSymm allows a separate correlation
# for each pair of time points within each id group.
# ?corSymm
# corSymm(form = ~ time | id)
# To see the correlation matrix for just the first individual:
corMatrix(Initialize(corSymm(form= ~ 1 | id), data=tlc))$`1`
## [,1] [,2] [,3] [,4]
## [1,] 1.000000e+00 6.934668e-310 6.934668e-310 6.934668e-310
## [2,] 6.934668e-310 1.000000e+00 6.934668e-310 6.934668e-310
## [3,] 6.934668e-310 6.934668e-310 1.000000e+00 6.934668e-310
## [4,] 6.934668e-310 6.934668e-310 6.934668e-310 1.000000e+00
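The off-diagonal entries printed above are numerically zero (denormal values) because no correlation parameters were supplied to corSymm, so it initializes at the identity. A sketch of supplying starting values for the lower triangle and then using the structure in the three gls forms (the response and time column names lead and week are assumptions about the tlc data):

```r
library(nlme)

# Unstructured correlation with supplied lower-triangle starting values;
# 4 time points per id give choose(4, 2) = 6 free correlations
cs <- corSymm(value = c(0.5, 0.4, 0.3, 0.5, 0.4, 0.5), form = ~ 1 | id)
cs <- Initialize(cs, data = tlc)
corMatrix(cs)$`1`  # now shows the supplied correlations instead of zeros

# The three common gls forms using this structure (column names assumed):
# fit_corr <- gls(lead ~ week, data = tlc, correlation = cs)
# fit_het  <- gls(lead ~ week, data = tlc, weights = varIdent(form = ~ 1 | week))
# fit_both <- gls(lead ~ week, data = tlc, correlation = cs,
#                 weights = varIdent(form = ~ 1 | week))
```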